ABSTRACT 


Health and Usage Monitoring Systems (HUMS) is an emerging technology in 
helicopter aviation. The United States Navy is evaluating its viability for use on its 
helicopter fleet. HUMS uses sensors placed throughout the helicopter to monitor and 
record vibration signals and numerous other aircraft operating parameters. This thesis 
evaluates the vibration signals recorded by a HUMS system using a statistical technique 
called tree-structured classification. The goal of the analysis is to demonstrate the 
technique’s ability to predict the presence of faulted components in the transmission of the 
SH-60B autonomously operated in a Helicopter Transmission Test Facility at Naval Air 
Warfare Center, Trenton, New Jersey. The analysis is implemented in the statistical 
software package S-plus (Mathsoft Inc., 1995). 

EXECUTIVE SUMMARY 


The United States Navy is currently evaluating a technology called Health and 
Usage Monitoring Systems (HUMS) which should prove capable of improving helicopter 
safety and reliability. It uses airframe-mounted sensors to monitor and record vibrations, 
flight control positions, and other parameters; these sensors are used to display 
information to the aviator and the ground maintenance crew regarding the aircraft 
operation, usage, and health. The HUMS system is being tested at Naval Air Warfare 
Center (NAWC), Trenton, New Jersey. It is mounted on a full scale SH-60 power drive 
system test bed called the Helicopter Transmission Test Facility (HTTF). 

The HTTF can accommodate 32 accelerometers that collect raw vibration data for 
each data acquisition. In a single acquisition, it collects raw data from every available 
accelerometer in the system. The resulting raw data is processed by proprietary algorithms 
of the B.F. Goodrich Company. These algorithms, developed under contract, are believed 
to give indications of faults in components of the helicopter power drive system. The 
outputs from these algorithms are ‘indicators’ that in some cases should characterize the 
location of a component with a fault and the type of fault that it has experienced. 

The HTTF in Trenton has been operating with the intent of building a database of 
“vibration signatures” for various component failures in the drive train. Data has been 
collected from the HTTF using components that were all believed to be good in order to 
establish a baseline vibration signature for each component. In addition, extensive “seeded 
fault” testing has been accomplished. This means that defective components are placed in 
the transmission so as to observe their behavior. 

The challenge of interpreting the data provided by this HUMS system is to 
determine which, if any, components are faulty. Tree-structured classification is a 
statistical method that provides a means of interpreting this data. The technique is 
analogous to normal or generalized linear regression in that it attempts to predict the value 
of a dependent variable based on the value of a set of independent variables. 


This thesis uses data from the input pinion in the intermediate gearbox of the 
HTTF and develops models using tree-structured classification to predict its physical 
condition. The data was acquired from two sensors physically located near the gear of 
interest. These models predict accurately within the confines of the available data. Their 
ability to predict beyond the data, however, may be marginal. This is not unexpected and 
does not imply a flaw in the methodology. It is more a problem of having relevant data to 
which to apply the method. This is demonstrated in the thesis by applying data from an 
operational aircraft to the models developed from the HTTF data. 

Currently, the NAWC Trenton HTTF is the best source of data for applying this 
method and developing models to predict failure in aircraft components. The ability to 
insert faulted components into an operational transmission enables NAWC Trenton to 
develop and maintain a rich data set for tree-structured classification. A better source of 
data would obviously be data from the aircraft itself. Although data is available from the 
aircraft, it is of little value in characterizing the structure of faulted components. For 
obvious reasons, little data exists in which a faulted component is flown in an operational 
aircraft. Without such data, models that accurately differentiate between good and faulted 
parts may be difficult to develop. 

Further research is necessary to fully investigate the usefulness of tree-structured 
classification in HUMS. Analysis similar to the type done in this thesis should be done on 
numerous other gears, bearings and shafts in the HTTF. The models developed through 
this research will help determine the usefulness of this type of analysis to HUMS. 

This thesis demonstrates the usefulness of tree-structured classification in HUMS 
research. Still much needs to be done to prove its ability to accurately predict faults in 
operational aircraft. Since HUMS is in its infancy, it is reasonable to believe that 
methodology similar to that contained in this thesis will assist in its development and 
implementation. 


I. INTRODUCTION 


The United States Navy is currently evaluating a technology which should prove 
capable of improving helicopter safety and reliability. This technology, widely known as 
Health and Usage Monitoring Systems (HUMS), has been tested and implemented in the 
United Kingdom for use in helicopter operations in the North Sea. The United States 
Navy is developing HUMS to increase safety of aircraft operation and improve the 
efficiency of maintenance. 

HUMS technology uses airframe-mounted sensors to monitor and record 
vibrations, flight control positions, and other parameters; these are used to display 
information to the aviator and the ground maintenance crew regarding the aircraft 
operation, usage, and health. Concurrent testing is being conducted at Helicopter Anti- 
Submarine Squadron, Light - 41 (HSL-41) at Naval Air Station (NAS) North Island, 
California, Naval Air Warfare Center (NAWC), Trenton, New Jersey and NAS Patuxent 
River, Maryland. 

The debate in the development of an emerging technology centers around 
contrasting the benefits of the system with its costs and limitations. A discussion of some 
benefits and limitations will serve to introduce HUMS and its potential usefulness to the 
United States Navy. 

A. BENEFITS OF HUMS 


The ultimate goal of HUMS is to provide improved information regarding the 
health and usage of an aircraft, which may provide extraordinary improvements in aircraft 
safety and maintenance. In addition to fiscal savings, HUMS may dramatically increase 
the operational capabilities of an aviation unit through increased and predictable aircraft 
availability and survivability. 


1. Safety 


Safety is a primary consideration in evaluating the benefits of a system that 
provides this information concerning the health of an aircraft. All aircraft mishaps are 
evaluated based on five possible causal factors: supervisory, aircrew, facilities, material, 
and maintenance. Of all class “A”¹ mishaps occurring during fiscal years 1991 through 
1995, thirty-two percent had material as a causal factor, and sixteen percent had 
maintenance as a causal factor [Ref. 1]. Both of these areas are targeted for improvement 
with the implementation of HUMS. 

If accurate HUMS information were available, an aircraft would never be flown 
with a potentially hazardous condition. In March of 1996, an AH-1W experienced a tail 
rotor failure and crashed, killing both pilots. The maintenance records revealed damage to 
the tail rotor during an earlier towing evolution on the flight line. The tail rotor and 
trunnion were removed and replaced, but the yoke was only visually inspected for damage. 
The inspection revealed no damage, but after the mishap it was hypothesized that it had 
experienced a stress riser during the towing incident. It was presumed that this weakness 
in the yoke eventually gave way to failure, causing the mishap. [Ref. 2] This is a dramatic 
example of the type of failure that should be detected by a health monitoring system. 

This mishap might have been avoided with a reliable HUMS. The value of human 
life and the increase in effectiveness of a crew confident in its aircraft combine to intensify 
the value of HUMS. This, along with the cost of replacing airframes, aircrew, and the 
savings from fewer mishap investigations all combine to make the issue of safety a prime 
motivation in developing a reliable HUMS system. 


2. Maintenance 


Another source of potential savings is the improved capabilities of maintenance 
personnel furnished with HUMS information. Currently, critical components are 
inspected, removed, or replaced according to a time schedule usually based on the flight 
hours they have experienced. This time schedule is understandably very conservative, 
ensuring that the “weakest” component of any lot will be replaced prior to its failure. This 
method results in countless removals and replacements of perfectly good components. 

¹ A class “A” mishap is one that results in fatality, aircraft destruction, or damage over $1,000,000 
[Ref. 3]. 


Many safeguards are in place to ensure the correctness and completeness of 
maintenance performed on Navy and Marine Corps helicopters. For critical component 
maintenance, an action is performed, inspected and checked for quality assurance. This 
process requires a minimum of three individuals. Once the maintenance action is 
performed and inspected, the paperwork must be reviewed by Maintenance Control, a 
“safe-for-flight” authority, and finally, the pilot. Clearly, with the safeguards integrated 
into the system, maintenance performed on the helicopters is predictably safe. However, 
risk remains every time any maintenance is performed. This risk is manifested in two 
ways. 

First, there is no guarantee of the relative health of the new component. Since the 
original component is being removed based on a time schedule, there is no reliable means 
of determining its health. All that is known is that it was operating when it was removed. 
The new component is likely to be a functioning component, as it has been subjected to 
tests of its own. However, it is occasionally the case that a faulty component is delivered 
from supply. This bad component may be swapped for a perfectly good component at the 
expense of the cost of the component plus many man-hours to perform and inspect the 
maintenance. 

The second manifestation of risk is that of improper maintenance. As discussed 
before, there are necessary inspections and re-inspections of critical component 
maintenance. There remains, however, the risk of error every time maintenance is 
performed. Every time a bolt is tightened, for example, there is a potential for over- 
torquing that bolt. This error may or may not be caught by the quality assurance process. 
Humans make mistakes and this risk factor will never be eliminated. 

With the implementation of a reliable HUMS, only necessary maintenance would 
be performed. This implementation should extend the usable life of many components 
without sacrificing reliability. The savings of fewer component replacements, and the 
reduced risks of only performing maintenance when necessary, are compelling arguments 
illustrating the benefit of HUMS to the maintenance efforts of an aviation unit. 


3. Operational Availability 


The goal of an operational aviation unit is to have aircraft available to fly for a 
scheduled operation or in response to any unplanned contingency. HUMS provides the 
means for a unit to effectively accomplish this goal. Ultimately, through improved safety, 
efficient maintenance, and improved logistic support, an aviation unit will be able to meet 
its operational requirements in an efficient, cost-effective manner. 


B. LIMITATIONS OF HUMS 


The realities of the state of HUMS technology are evident in the difficulties 
encountered by the companies in the United Kingdom currently implementing HUMS. In 
the United Kingdom, HUMS systems are employed on helicopters transporting workers to 
and from oil platforms in the North Sea. Two of the difficulties encountered as HUMS is 
implemented are data quality and false alarms. The success in United Kingdom HUMS 
usage has been not in its technical performance, but rather in public relations. In some 
instances, “HUMS” is painted on the side of helicopters incorporating the system to 
reassure the passengers of the safety and reliability of the aircraft. Though the passengers 
feel safe, due to these difficulties, the true margin of safety benefit may be negligible. 
[Ref. 4] 


1. Data Quality 


The strength of HUMS is its ability to acquire data and use it to determine the 
health of critical components. The confidence in the system can only be as high as the 
confidence in the quality of the data. The data collected by HUMS ranges from vibrations 
of individual gears, shafts, and bearings in the transmission to the positions of the flight 
controls in the cockpit. The integrity of the data relies on the maintenance level of 
accelerometers, flight position indicators, and many other HUMS components including 
hundreds of feet of cabling. The dependence on data quality begins in the developmental 
stages of the technology, and extends to its implementation. 

In the developmental stages of the technology, the quality of the data determines 
the quality of the technology itself. If the technology is developed around poor data, then 
it will perform poorly. This idea extends to the implementation of HUMS. The quality of 
the data that is acquired in the implementation of HUMS must be maintained. The 
reliability of an operational HUMS depends on the quality of the data. 

Along with the issue of data quality comes the question of data maintenance. In 
evaluating the health of certain components, HUMS makes a determination in one of two 
ways. The data for the component may exceed a defined limit called a threshold, or it 
might exceed a limit based on its trends. In order for this trending capability to be 
effective, the data for each specific component must be archived and carried along with it 
as it is removed and replaced on the same or another aircraft. Each critical component, as 
well as each aircraft, must maintain its own database for HUMS to be effective. Vibration 
and rotor analysis, as being developed in HUMS, is complicated and its implementation 
must be carefully planned and monitored. [Ref. 4] 


2. Errors 


The most notable shortcoming of the United Kingdom HUMS system is the 
propensity for erroneous indications. There are several types of errors that can occur in a 
HUMS system. The most obvious are the false positive indication (false alarm) and the 
false negative indication. A false alarm occurs when HUMS indicates that a healthy 
component has experienced some sort of fault. The false negative is a more dangerous 
error in that HUMS fails to give warning in the case of a faulty component. 


a. False Alarms 


It is not uncommon in United Kingdom companies using HUMS-equipped 
helicopters to have eighty percent or more of the fleet in exceedance of a HUMS 
threshold, indicating that those aircraft are not flight ready [Ref. 4]. These threshold 
values are predetermined limits set on specific components monitored by HUMS. That 
eighty percent of the fleet that is in exceedance normally does not have any faulted 
components. Instead, the cause of the exceedance is that a conservatively low threshold 
value was set. This problem puts the United Kingdom oil companies in a situation where 
decisions must be made concerning the safety of their aircraft. They must either ignore the 
HUMS indications and fly their aircraft under the exceedance, or they must endure 
excessive maintenance demands and reduced operational availability due to the required 
inspections. In either case, HUMS is burdening the helicopter operations by either 
reducing confidence in the aircraft, or requiring excessive maintenance and inspections. 

There are several causes of the excessive false alarm rate. The most 
obvious, and the one with the most potential for corrective action, is the setting of the 
thresholds. The question of where a threshold should be set is a central issue of debate in 
HUMS development. A threshold is a value set for a specific component of the aircraft 
that is monitored by a HUMS sensor. The HUMS sensor takes a reading from the 
component and compares the value of the reading to the threshold value. If it exceeds the 
threshold, the component is flagged as faulty. The challenge is to set the threshold value 
low enough that if a component is faulty, it will be detected, but high enough to avoid 
flagging good components as faulty. 
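
The trade-off described above can be illustrated with a minimal Python sketch. The readings, the threshold values, and the function name `count_errors` are all hypothetical and invented for illustration; they are not HUMS data or any part of an actual HUMS implementation.

```python
# Hypothetical indicator readings, invented purely for illustration.
healthy = [0.8, 1.1, 1.3, 1.6, 2.0]   # readings from good components
faulty = [1.5, 2.4, 3.1]              # readings from faulted components


def count_errors(threshold, healthy_readings, faulty_readings):
    """Return (false_alarms, false_negatives) for one threshold.

    A component is flagged as faulty when its reading exceeds the threshold.
    """
    false_alarms = sum(1 for r in healthy_readings if r > threshold)
    false_negatives = sum(1 for r in faulty_readings if r <= threshold)
    return false_alarms, false_negatives


# A low threshold produces false alarms; a high one produces false negatives.
for threshold in (1.0, 1.7, 2.5):
    print(threshold, count_errors(threshold, healthy, faulty))
```

With these invented readings the two error counts move in opposite directions as the threshold rises, which is the tension the threshold debate has to resolve.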


b. False Negative Indication 


A false negative indication occurs when HUMS gives no warning of a fault 
when there actually is a fault present. Setting the threshold value appropriately is a major 
consideration in eliminating the false negative indication error. This error is the more 
dangerous of the two types of errors discussed. Detecting and warning of faulted 
components is the basis for HUMS development. If this type of error is not manageable, 
then the concept of HUMS is not worth pursuing. 


C. SCOPE OF THESIS 


This thesis will focus on analyzing the data from a developmental HUMS at 
NAWC, Trenton; Chapter II will describe this system. Chapter III will describe 
Classification Trees, a non-parametric technique used to uncover structure in a data set. It 
will also discuss specifically how the data acquired from a helicopter transmission test bed 
is modeled using this technique. Chapter IV will present the results of the analysis and 
describe the specific models used. Using the models and their output, Chapter V will 
discuss their possible usefulness and areas of further study. 


II. BACKGROUND 


A. HIDS DESCRIPTION 


The system being tested at NAWC, Trenton is a HUMS called Helicopter 
Integrated Diagnostic System (HIDS). HIDS testing uses a test bed with a full scale 
Helicopter Transmission Test Facility (HTTF) consisting of the entire SH-60 power drive 
system. 

The HTTF can accommodate up to 32 accelerometers that simultaneously sense 
the vibration signals of all the components that are “near” at a rate of 100,000 samples per 
second. In this context, “near” means that the accelerometer can detect the signal of any 
component that has an accessible path from which vibration signals can be sensed. A 
single component may be “near” more than one accelerometer. [Ref. 5] 


B. AVAILABLE DATA 


The accelerometers collect raw vibration data for up to thirty seconds per 
acquisition. In a single acquisition, HIDS will collect data from every available 
accelerometer in the system. Most acquisitions require between four and ten seconds to 
record a complete vibration signature from all of the monitored components. 

In the Trenton HTTF, six data sets are usually acquired per test run. The first is 
with cold oil at low power settings. The second is with hot oil at the maximum power 
setting. The remaining four data sets are acquired with hot oil varying the power setting 
evenly between maximum and minimum. Ambient cell temperature can also be varied 
between zero and forty degrees Celsius. [Ref. 5] 

The resulting raw data is processed by proprietary algorithms of the B.F. Goodrich 
Company. These algorithms, developed under contract, are believed to give indications of 
faults in components of the helicopter power drive system. 


C. INDICATORS 


The outputs from these algorithms are “indicators” that in some cases should 
characterize the failing component and the type of fault that the component has 
experienced. These indicators are proprietary in nature and will not be discussed in detail 
in this thesis. In general terms, the indicators compute statistical measures from the raw 
data describing certain characteristics of the vibration signal and various types of energy 
emitted from the component. 

Components of the power train are categorized into three separate classes: gears, 
shafts, and bearings. A different set of indicators is computed and recorded for each type 
of component. For example, gears have associated with them one set of computed 
indicators, while shafts and bearings have different sets of indicators associated with them. 
An example of an indicator is “roller bearing energy.” This indicator is computed for 
each component at every sensor that can “see” that component. In other words, roller 
bearing energy is computed for a single component every time it is detected by a sensor. 
For a single acquisition, the roller bearing energy of a component is recorded the same 
number of times as there are sensors that “see” it. 


D. DATA COLLECTION 


The indicator data has been provided in Matlab format. Each acquisition results in 
three Matlab matrices, one each for gears, bearings, and shafts. The matrices contain the 
computed indicators for each component/sensor combination that maintains a path of 
transmissibility. From these matrices, any indicator from any component/sensor 
combination can be isolated and evaluated. 

The HTTF in Trenton has been operating with the intent of building a database of 
vibration signatures for various component failures in the drive train. There are currently 
over 900 data acquisitions, some lasting up to 30 seconds, but in most cases lasting 
between four and ten seconds. Data has been collected from the HTTF using components 
that were all believed to be good in order to establish a baseline vibration signature for 
each component. In addition, extensive “seeded fault” testing has been accomplished. 
This means that defective components are placed in the transmission to observe their 
behavior. The HTTF employs defective components of two distinct types. 

The first type of failure is the fleet-rejected component failure. These components 
have faults discovered during routine organizational, intermediate, or depot-level 
maintenance. They are delivered to NAWC, Trenton for evaluation on the HTTF and then 
returned. These components are beneficial for demonstrating the characteristics of failures 
actually occurring in fleet aircraft. The limited availability of these components demands 
an alternate source of component failure for evaluation on the HTTF. 

The second type of failure fulfills this requirement. These failures are the result of 
intentionally damaging otherwise good components. An example of this would be 
removing a portion of a tooth from a particular gear. These components are easily 
attainable and since they do not have to be returned, are available for extensive analysis. 
They provide the experimenters with the flexibility to focus their analysis in an organized 
way. The realism of using components damaged in operational aircraft is sacrificed in 
order to attain the convenience and flexibility that this type of component failure provides. 

In order to achieve the goals established for HUMS, the data collected from a 
system like HIDS must provide definitive solutions to the problem of determining the 
health of components in the SH-60 power drive system. Simply stated, the challenge of 
interpreting the data provided by HIDS is to determine which, if any, component is faulty. 





III. ANALYSIS 


A. METHODOLOGY 
1. Overview of Tree-Structured Classification 


Tree-structured classification is a statistical method that builds classification trees 
to uncover structure in a data set. It is an exploratory technique that is analogous to 
normal or generalized linear regression in that it attempts to predict the value of a 
dependent variable based on the value of a set of independent variables. If the dependent 
variable in the data set of interest is categorical, the tree grown by this method is called a 
classification tree. If the dependent variable is continuous, then the tree is called a 
regression tree. 

The advantages of tree-structured classification over more familiar regression 
techniques are its ease of interpretation, its ability to handle multiple responses, and its 
ability to handle a mix of categorical and continuous independent variables. There are 
other advantages which make this technique a flexible alternative to regression. Because it 
is a non-parametric technique, the assumptions that must be made about the data are 
reduced and the applicability of the model is generalized. It is insensitive to monotone 
transformations of the independent variables. This eliminates the exploratory attempts to 
improve the model by transforming the independent variables. [Ref. 6] 


2. Medical Example 


Tree-structured classification is useful in the medical profession for identifying 
patients who are at high risk of death. By way of introduction to tree-structured 
classification, a medical example adapted from Breiman et al. (1984) is presented. 

Patients who enter a hospital following a heart attack exhibit a wide range of 
variability in their propensity for recovery. A physician, with knowledge about what 
characteristics influence a patient’s ability to recover, is able to allocate the proper 
resources to those patients who are at higher risk of death. The data set used in this 
example consists of 215 patients who checked into a hospital following a heart attack and 
survived more than 24 hours. Of these 215 patients, 37 died within 30 days of admission, 
and 178 did not. The 178 “survivors” are called class “live” and the 37 “early deaths” are 
called class “die.” 

In tree-structured classification, each data point is called a “case.” In this example 
each patient represents a different case who falls into either class “live” or “die.” They 
also exhibit certain characteristics that a physician hopes will predict their likelihood of 
surviving at least 30 days after admission to the hospital. These characteristics are the 
independent variables used by the tree-structured classification. 

For the example, the variables have been limited to those that have been shown to 
characterize this longevity. The first variable associated with each patient is the minimum 
systolic blood pressure over the 24-hour period following admission to the hospital. This 
is a continuous variable ranging over all possible blood pressure measurements. The 
second variable is the patient’s age. This is a continuous variable measured in years. The 
final variable is the presence of sinus tachycardia. This is a categorical variable with levels 
of “true” and “false.” By definition, sinus tachycardia is present if the sinus node heart 
rate exceeds 100 beats per minute during the first 24 hours following admission to the 
hospital; the sinus node is the normal electrical pacemaker of the heart and is located in 
the right atrium. [Ref. 7] 

Tree-structured classification is an iterative procedure that attempts to separate all 
the cases of a data set into nodes of a binary tree that are “pure.” By definition, “pure” 
means that all the cases in a single node have exactly the same realization in the dependent 
variable. In the medical example, a “pure” node would be one where all the patients in 
that node either survived at least 30 days, or all died within 30 days. 

The root node of this binary classification tree contains all the cases in the data set. 
From this node, a determination is made regarding a split of the data into two separate 
“child” nodes. At each node the tree algorithm searches through $l$ independent variables 
one by one, beginning with $x_1$ and continuing up to $x_l$. For our example, $l = 3$ and 
$x_1$ = “systolic pressure,” $x_2$ = “age,” and $x_3$ = “tachycardia.” At each variable it 
evaluates the change in purity (in a sense to be discussed later) if all the cases in that node 
were split based on each possible value of that variable. A split is chosen at a specific 
value, $j$, of a single independent variable, $x_i$. The right child node gets all cases for which 
$x_i > j$ and the left child node gets all cases for which $x_i < j$. Considering the data at the 
root node of our medical example, the algorithm evaluates every possible split of the 
cases, and picks the split that gives the greatest improvement in purity. It first checks the 
systolic blood pressure variable. It evaluates the change in purity for splits made between 
distinct values of systolic blood pressure observed in the data set. It then does the same 
for the splits made between distinct values of observed age. Finally, it looks at the 
presence of sinus tachycardia. It evaluates the change in purity if a split were made 
between the cases where sinus tachycardia was present, and those where it was not. From 
all the possible splits, the algorithm chooses the one that gives the greatest improvement in 
purity. [Ref. 7] 
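
The enumeration of candidate splits on a continuous variable can be sketched in Python as follows. This is a minimal illustration, not the S-plus implementation: placing each candidate at the midpoint between consecutive distinct values is an assumption, the function names `candidate_splits` and `partition` are hypothetical, and the patient records are invented. A two-level categorical variable such as sinus tachycardia would contribute only one candidate split (true versus false).

```python
def candidate_splits(values):
    """Midpoints between consecutive distinct observed values (an assumed convention)."""
    distinct = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]


def partition(cases, variable, split_value):
    """Send cases below the split value to the left child and cases above it to the right."""
    left = [c for c in cases if c[variable] < split_value]
    right = [c for c in cases if c[variable] > split_value]
    return left, right


# Hypothetical patients; the numbers are invented for illustration only.
patients = [
    {"pressure": 85, "age": 71, "class": "die"},
    {"pressure": 100, "age": 58, "class": "live"},
    {"pressure": 85, "age": 64, "class": "live"},
    {"pressure": 120, "age": 49, "class": "live"},
]

splits = candidate_splits([p["pressure"] for p in patients])
left, right = partition(patients, "pressure", splits[0])
print(splits, len(left), len(right))
```

A full split search would repeat this for every variable and keep the candidate that yields the greatest improvement in purity.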

The splitting rule implemented in S-plus (Mathsoft Inc., 1995) departs slightly 
from the recursive partitioning methods discussed in Breiman et al. (1984). S-plus uses 
the deviance (likelihood statistic) to measure the “purity” of the node. Every node has a 
measure of impurity called deviance. At each node $i$ of a classification tree, the vector 
$\mu_i = (p_{i1}, \ldots, p_{ik})$ is the probability distribution over the $k$ classes. Each case in node $i$ is 
assumed to be drawn from a multinomial distribution with parameter $\mu_i$. At node $i$, 
$n_{ik}$ cases are observed in class $k$, where $\sum_k n_{ik} = n_i$. The deviance at a node is defined as 
the negative of twice the log-likelihood, 

$$D_i = -2 \sum_k n_{ik} \log p_{ik}.$$

Since we do not know the probabilities, we must estimate $\mu_i$ for node $i$, 
using the observed class proportions, $\hat{p}_{ik} = n_{ik} / n_i$. 

Now, consider splitting the cases from node $i$ into two child nodes $l$ and $r$. The split 
would be made such that the decrease in deviance of the node, 

$$\Delta D = D_i - D_l - D_r,$$

is maximized, since a decrease in deviance means an increase in purity. [Ref. 8] 

Using the data from the medical example, we compute the deviance of the root 
node as an illustration. As previously stated, there are two classes of patients, “live” or 
“die.” Thus, each case in the root node is assumed to be drawn from a multinomial 
distribution with $k = 2$. If $\mu_i = (p_{i1}, p_{i2})$, then $p_{i1}$ = prob{'live'} and $p_{i2}$ = prob{'die'}. 
At the root node, there are a total of $n_i = 215$ cases, $n_{i1} = 178$ with level “live” and 
$n_{i2} = 37$ with level “die,” giving $\hat{p}_{i1} = 178/215$ and $\hat{p}_{i2} = 37/215$, and the deviance at the root 
node is equal to 

$$-2\left[178 \ln\frac{178}{215} + 37 \ln\frac{37}{215}\right] = 197.45.$$


The first split of the cases in the example is made on systolic pressure. The split is 
made such that all the cases with systolic pressure less than 92.5 go to the left child node 
and all the cases with systolic pressure greater than 92.5 go to the right child node. The 
split results in $n_l = 20$ cases in the left node and $n_r = 195$ cases in the right node. Of the 
20 cases in the left node, $n_{l1} = 6$ have the level “live” and $n_{l2} = 14$ have the level “die.” 
Of the 195 cases in the right node, $n_{r1} = 172$ have the level “live” and $n_{r2} = 23$ have the 
level “die.” The resultant deviance is the sum of the deviance of the two child nodes, 

$$-2\left[6 \ln\frac{6}{20} + 14 \ln\frac{14}{20}\right] - 2\left[23 \ln\frac{23}{195} + 172 \ln\frac{172}{195}\right] = 165.93,$$

which is smaller than the deviance of the root node (and is the smallest possible across all 
possible splits). 
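
The deviance arithmetic for the root node and the first split can be checked with a short Python sketch. The function name `node_deviance` is hypothetical, the class probabilities are assumed to be estimated by the observed proportions in each node, and this is not the S-plus code itself.

```python
import math


def node_deviance(class_counts):
    """Deviance -2 * sum(n_k * ln(n_k / n)), with proportions estimated from the counts.

    Classes with zero cases contribute nothing (0 * ln 0 is taken as 0).
    """
    total = sum(class_counts)
    return -2.0 * sum(n * math.log(n / total) for n in class_counts if n > 0)


root = node_deviance([178, 37])    # 178 "live", 37 "die"
left = node_deviance([6, 14])      # systolic pressure below 92.5
right = node_deviance([172, 23])   # systolic pressure above 92.5
decrease = root - (left + right)   # reduction in deviance from the split

print(round(root, 2), round(left + right, 2), round(decrease, 2))
```

The rounded values agree with the 197.45 and 165.93 quoted in the text, so the first split lowers the deviance by roughly 31.5.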

Each split of a node results in a tree which is more pure in the dependent variable. 
The purity of the tree is defined by the deviance of the tree, 

$$D = \sum_{i \in \Lambda} D_i,$$

where $\Lambda$ is the set of all nodes on which splits have not yet been made. This set of nodes 
is called the “leaf nodes.” A “terminal node” is a leaf node on which no further splits are 
made. [Ref. 8] 


If a tree is allowed to grow until each terminal node contains only one case, then it 
has a total deviance of zero, perfectly characterizing the structure of the data. This tree, 
however, may be worthless for predicting the classification of new data not found in the 
data set used to grow the tree, analogous to the regression situation of using $n$ data points
to fit a linear model with $n$ unknown coefficients.

A set of stopping criteria is in place to ensure that over-fitting of the data is not 
carried to this extreme. Even though an over-sized tree may be useless for predicting new 
data, the tree must be allowed to grow sufficiently large to uncover all relevant structure. 
Failure to grow the tree sufficiently may leave significant structure uncovered. The idea is 
to grow the tree larger than desired and then “prune” it back to one that is useful in 
predicting classifications of new data. Figure 1 is the over-sized tree grown from the 
medical data prior to any pruning. 

The interpretation of the tree graph is relatively simple. Each node is labeled with 
the level of the dependent variable that characterizes the majority of the cases in that node. 
For instance, since 178 of the 215 patients did live at least thirty days, the root node of 
Figure 1 has the label “live.” This indicates that the majority of the patients in that node
had the level “live” as their dependent variable. 

Below each terminal node of the graph is the misclassification rate of the cases in 
that node with respect to its node label. For instance, the root node is labeled “live,” but, 
in fact, 37 of the 215 cases in the root node actually died within the first 30 days. 
Therefore, the misclassification rate under the root node reads 37/215. 

The labels on the arcs of the tree give the variable on which the split of the cases was
made. The first split of the cases occurred on systolic pressure. All those who had
systolic pressure less than 92.5 were split into the left node, and all those who had systolic
pressure greater than 92.5 were split into the right node.¹

¹ The comparison of an independent variable is always evaluated as greater than or less than the value
chosen to split the data. The implementation of classification trees always chooses candidate splits of an
independent variable between distinct values of the individual cases. There is no possibility of an
independent variable having a value equal to a value of its candidate split. For example, if there was a
patient with systolic blood pressure of 92.5, then a different splitting value would have been chosen.
[Ref. 6]
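
The footnote's point about candidate splits can be sketched in a few lines of Python. This sketch assumes that candidate splits are the midpoints between consecutive distinct values; the text only states that they lie between distinct values, so the midpoint rule and the sample pressures are assumptions made for illustration.

```python
def candidate_splits(values):
    """Return the midpoints between consecutive distinct sorted values (assumed rule)."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

# Hypothetical systolic pressures; not taken from the medical data set.
pressures = [88, 90, 95, 95, 110]
splits = candidate_splits(pressures)
print(splits)
```

Because each candidate lies strictly between two neighbouring distinct values, no case can have a value equal to a candidate split.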
Figure 1. Over-Sized Tree Grown for Medical Example

The terminal nodes of the tree are represented by square boxes. These terminal
nodes are labeled with the level of the dependent variable corresponding to the level of the
majority of the cases in that node. Ideally, every terminal node would have a
misclassification rate of zero. For a “pure” node containing $n$ cases, either all $n$ are “live”
or all $n$ are “die.” The likelihood function reduces to $p^n$ with $p = n/n = 1$, so the deviance is
$-2[n \ln 1] = 0$. Since real data rarely behaves ideally, growing a tree where all terminal
nodes are pure is uncommon.

Methods are provided by S-Plus to reduce the size of the tree to the “right size.” 
The method used to determine the “right size” is called cross-validation, and will be
discussed later in this chapter. The method provided to implement cross-validation is 
called “pruning.” This method takes a tree model as required input, and reduces it in size 
according to a cost-complexity parameter that may be changed by the user. 

The output of the pruning method implemented in S-plus is either a single pruned 
tree if the cost-complexity parameter is given, or a series of pruned trees based on a 
sequence of cost-complexity parameters. This series of pruned trees is what the
cross-validation method uses to determine the right-sized tree.

The pruning method determines the deviance (or impurity) of the trees ranging in 
size from the over-sized tree, to the tree consisting of only the root node. The deviance in 
the pruning method is actually the sum of the deviance of the tree plus a weighted penalty 
for the size of the tree, which is the number of terminal nodes of the tree; the weight is 
called the cost-complexity parameter. It is intuitive that as the size of the tree increases, 
the purity of the tree will also increase. Figure 2 shows the results from pruning the full 


tree in the medical example. 

Figure 2. Pruning Sequence for Medical Example
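
The penalized deviance used in pruning can be illustrated with a short Python sketch. It is a sketch of the idea only, not the S-plus pruning routine: the function names are hypothetical, and all deviances in the sequence other than the one-node and two-node values from the medical example are invented.

```python
def cost_complexity(deviance, n_terminal, alpha):
    """Penalized deviance: tree deviance plus alpha times the number of terminal nodes."""
    return deviance + alpha * n_terminal

def best_size(sequence, alpha):
    """Pick the tree size with the smallest penalized deviance for a given alpha."""
    return min(sequence, key=lambda size: cost_complexity(sequence[size], size, alpha))

# Hypothetical pruning sequence: number of terminal nodes -> tree deviance.
# Only the 1- and 2-node values come from the medical example; the rest are invented.
pruning_sequence = {1: 197.45, 2: 165.93, 5: 140.0, 10: 125.0, 20: 115.0}

print(best_size(pruning_sequence, alpha=0.0))    # no penalty: the largest tree wins
print(best_size(pruning_sequence, alpha=5.0))    # moderate penalty: a mid-sized tree
print(best_size(pruning_sequence, alpha=50.0))   # heavy penalty: the root-only tree
```

Raising the cost-complexity parameter shifts the preferred tree toward smaller sizes, which is how a sequence of parameters yields a sequence of pruned trees.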


There is a point in the process where the benefit of increased purity is countered by 
a tree’s inability to accurately predict the response of cases not used to actually grow the 
tree. Cross-validation is a way of determining the size of tree that optimizes both the 
purity of the tree and its ability to predict from new data. 

Cross-validation uses pruning to determine the “right-sized” tree. If the data set is
sufficiently large, then part of the data can be used to grow the tree, and the remaining 
data used to check for the tree’s ability to accurately classify it. Cross-validation is a 
method used in the case where the size of the data set is not large enough to hold back 
data in order to check for its predictive accuracy. 

Ten-fold cross validation takes the complete data set and partitions it into ten 
nearly equal sets. Each set is removed in turn; then the remaining nine tenths are used to 
grow an over-sized tree. The over-sized tree is pruned as previously discussed, resulting
in a sequence of pruned trees similar to Figure 2. The one-tenth of the data that was
removed prior to growing the tree is then applied to that specific sequence of pruned trees
to test its predictive accuracy. The deviance from the cases applied to each of the pruned
trees in the sequence is recorded.

The procedure is performed nine more times for each of the unique partitions of
the data set. When this is finished, there are ten deviances recorded for each size in the
sequence of pruned trees. Cross-validation plots the minimum deviance from all ten trees
at each size in the sequence. In general, as the size of a tree increases, the deviance
decreases, until a point at which the size of the tree is so large that it loses its predictive
ability. This minimum point of deviance is the determination of the “right-sized” tree.


Figure 3 is a plot of the ten-fold cross-validation for the medical example. 

Figure 3. Cross-validation plot for Medical Example
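
The partitioning step of ten-fold cross-validation can be sketched as follows. This is an illustrative Python sketch, not the S-plus routine: the function name, the shuffling, and the seed are assumptions, and the tree growing and pruning steps are only indicated by a comment.

```python
import random

def k_fold_indices(n_cases, k=10, seed=0):
    """Shuffle case indices and deal them into k nearly equal folds."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

folds = k_fold_indices(215, k=10)   # 215 cases, as in the medical example
for held_out in folds:
    held = set(held_out)
    training = [i for i in range(215) if i not in held]
    # A tree would be grown on `training`, pruned to a sequence of sizes, and the
    # deviance of the `held_out` cases recorded for each size in that sequence.

print(sorted(len(f) for f in folds))
```

Each case is held out exactly once, and the fold sizes differ by at most one case.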


Cross-validation gives the best size for a tree based on the given data. From this 
information, a tree is grown from the entire data set and pruned back to the appropriate 
size. This tree becomes the model from which exploration of the structure of the data can
begin. Figure 3 clearly shows that a tree of five or six nodes is the appropriate size for this
set of data, since the deviance reaches its minimum at these points. Figure 4 is the plot of
a tree that has been fully grown and then pruned back to a five node tree, based on the
results of the ten-fold cross-validation.

Figure 4. Tree Grown From Medical Data Pruned to Five Nodes

B. CLASSIFYING HTTF DATA


For the analysis in this thesis, data is taken from a single component of the NAWC, 
Trenton HTTF: the input pinion in the intermediate gearbox. The input pinion is a gear in 
the intermediate gearbox that accepts rotational power from the main gearbox in the 
transmission and redirects it toward the tail gearbox. The question being asked about the 
data acquired from the HTTF is, “Can a fault be identified in a component of the 
transmission, based on the indicators computed from the raw data?” This is analogous to 
the medical planners who wanted to know if the probability of survival of their heart 
attack victims could be predicted by the victim’s age, systolic blood pressure, and the 
presence of sinus tachycardia. The dependent variable in this case is the physical state of 
the input pinion. It is a categorical variable with levels or classes consisting of all possible
conditions of that component. The independent variables are the indicators, as discussed in 
Chapter II, computed for the vibrations detected by each sensor able to see the input 
pinion. Out of all the acquisitions taken from the HTTF, 640 were available for this 
analysis. These acquisitions were taken from 1 December 1994 to 3 January 1997. 

The dependent variable is a single variable with up to four levels. Of the 640
acquisitions, 396 had no faults in the intermediate gear box input pinion. These
acquisitions are assigned the first level, “no fault,” and are considered to be the
baseline data. The second level is “edm fault.” An edm fault is a machined slit made in a
tooth of the pinion. Of the remaining 244 acquisitions, 186 had this fault. The purpose of
the edm slit is to propagate a crack in the gear from the weakness in that area of the tooth.
The input pinion was not responding to a single edm slit, so three slits were made to
accelerate the crack growth. Another 36 acquisitions had this fault and are assigned the
third level, “edmthree fault.” The fourth possible level for the dependent variable is “tooth
fault.” This fault is caused by intentionally removing one-third of a tooth in the input
pinion. There are 22 acquisitions with this fault. All of these faults are a result of
intentionally corrupting the otherwise flight-ready component.

Two accelerometers attached to the intermediate gearbox act as vibration
sensors for the input pinion. For each sensor, 38 indicators are computed for the vibration
signals received from the single input pinion. These 76 indicators are all included as
independent variables in the analysis.

Four other parameters were measured and used as independent variables. During 
some of the data acquisitions, the HTTF was operating only one engine. This is recorded 
and used as a categorical independent variable with three levels (both operating, starboard 
operating, or port operating). Another independent variable is the time between data
acquisitions which provides information about the temperature of the system oil. Finally, 
the last two independent variables are the values of the main and the tail rotor torque. 
These give an indication of the power applied to the system. When the tail rotor is not 
active, a tail rotor torque of zero is recorded. Even though the gears are spinning, there is 
no torque applied to the tail rotor transmission. Since the intermediate gear box transfers 
power from the main gear box to the tail rotor gear box, the implications of zero torque 
on the tail rotor are significant. In all, this gives 80 independent variables for the single 
categorical dependent variable. Appendix A contains a sample set of the data used. 

The goal of the tree-based classification model is to predict the physical state of 
the intermediate gear box input pinion based on the independent variables. Several 
different models may be developed from the same data set. After determining the possible
models, a determination of the “best” model must be made and subsequently interpreted.

IV. RESULTS


The applicability of tree-structured classification to HUMS research is dependent 
on the way the data set is structured with regards to its dependent variable. The data can 
be configured in several different ways depending on what structure needs to be 
uncovered in the analysis. For instance, the data contains four different states defined as 
the classes. Each class corresponds to the physical state of the component of interest 
during a particular acquisition or case. Since the goal of the study at NAWC Trenton is to 
determine if faults can be detected, then it is reasonable to assume that each of the states 
that correspond to any type of fault could be aggregated into a single state called “fault.” 
All of the baseline data would fall into a second state called “no fault.” 

Other possibilities exist in defining the state variables. While the previous example
determined the presence of any fault, a second approach is to determine the presence of
each type of fault known to be present in the data set. In the case of the data obtained for
the input pinion, a dependent variable is defined as either “no,” “edm,” “edmthree,” or
“tooth.” This type of analysis adds another level of error not present in the previous
“fault” / “no fault” example. This structure of the dependent variable is subject to three
types of errors. As discussed in Chapter I, the first two error types are the false positives
and the false negatives. A third type of error introduced with this structure is the error of
fault misclassification. These errors occur when the model classifies a case as one type of
fault when in fact it is a different type of fault. Although this is an error, it is the least
costly error assuming that the two faults have similar impact on the operational capability
of the aircraft.

The research in this thesis focuses on these two structures of the dependent 
variable. Model one defines the dependent variable as a factor with four levels. It 
attempts to distinguish each type of fault present as well as those that are not faulted. 


Model two simplifies the definition of the dependent variable into “fault” or “no fault.”
This approach eliminates the possibility of misclassifying a fault of one type as a fault of a
different type.
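
The three error types, and the effect of collapsing the four levels of model one into the two levels of model two, can be sketched in Python. The function names and the six example cases are hypothetical and are not drawn from the HTTF data.

```python
def error_counts(actual, predicted, no_fault="no"):
    """Tally missed faults, false alarms, and fault misclassifications."""
    counts = {"missed_faults": 0, "false_alarms": 0, "fault_misclassifications": 0}
    for a, p in zip(actual, predicted):
        if a == p:
            continue
        if p == no_fault:
            counts["missed_faults"] += 1             # fault present, none predicted
        elif a == no_fault:
            counts["false_alarms"] += 1              # no fault present, fault predicted
        else:
            counts["fault_misclassifications"] += 1  # wrong type of fault predicted
    return counts

def collapse(labels, no_fault="no"):
    """Model two labels: every fault type becomes the single level 'fault'."""
    return [lab if lab == no_fault else "fault" for lab in labels]

# Hypothetical cases for illustration only.
actual    = ["no", "edm", "edmthree", "tooth", "no",  "edm"]
predicted = ["no", "no",  "edm",      "tooth", "edm", "edm"]

print(error_counts(actual, predicted))                      # four-level labels
print(error_counts(collapse(actual), collapse(predicted)))  # two-level labels
```

With the collapsed labels, the case predicted as the wrong fault type no longer counts as an error, while missed faults and false alarms are unchanged.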
A. MODEL DEVELOPMENT 


1. Cross-Validation 


Models were developed using the methods described in Chapter III. After 
determining the target size of the trees based on a ten-fold cross-validation procedure, two 
separate trees were grown. The tree for model one was grown and then pruned back to 
the best eleven terminal nodes. From the 640 cases presented to the model, a total of 23 
errors were made. There were 16 missed faults, 7 false alarms, and no fault 
misclassifications. This tree is depicted in Figure 5. 

The tree for model two was grown and then pruned back to the best twelve 
terminal nodes. From the 640 cases presented to the model, a total of 20 errors were 
made. There were 13 missed faults and 7 false alarms. This tree is depicted in Figure 6. 
Appendix B contains detailed S-plus output from all the tree models developed. Table 1 


summarizes the trees developed using cross-validation. 


MODEL 1: Dependent Variable: “Fault,” “EDM,” “EDMTHREE,” “Tooth” 
MODEL 2: Dependent Variable: “Fault,” “No Fault” 


Model | Overall Misclassification Rate | Missed Faults | False Alarms | Misclassification of Faults
1     | 23/640                         | 16            | 7            | 0
2     | 20/640                         | 13            | 7            | N/A

Table 1. Summary of Trees from Cross-Validation

In analyzing the two trees, it was discovered that they were both sensitive to the
data used to build them. For instance, a tree grown using a random ninety percent sample
of the data could vary significantly from a tree grown from a different sample of the same
size. If more than one tree can be built describing the same set of data, then there must be
one that is better than the other. It is not clear that the tree grown and pruned back to the
size suggested by cross-validation necessarily results in the “best” tree. The “best” tree is
one that has a small misclassification rate, while maintaining a small error rate in predicting
data not used to grow the tree.

Figure 5. Model One Tree Pruned to Eleven Nodes

Figure 6. Model Two Tree Pruned to Twelve Nodes


2. Heuristic Method 


To determine the stability of the tree models, a heuristic method was developed 
using functions available in S-plus. The heuristic method simply builds multiple tree 
models using various configurations of the data. From the different models, a best tree is 
determined. The code used to implement this method is found in Appendix C. 

The procedures for the heuristic method are simple. A random sample is taken 
from each level of the dependent variable. For model one, a random sample was taken
from the levels corresponding to each type of fault. For model two, a random sample was 
taken from the levels corresponding to “fault” or “no fault.” Initially, this was a random 
sample consisting of half of the data in each level of the dependent variable. Using model 
two as an example, a random sample of 198 cases from the 396 “no fault” cases, and 122 
cases from the 244 “fault” cases was drawn. From these 320 cases, a tree was grown and 
then pruned back to the size suggested by a two-fold cross-validation, since only half of 
the data is used. In the case of model two, this target size is eight terminal nodes. The 
remaining 320 cases not used to build the tree were applied to the model which resulted in 
a prediction misclassification rate. 

Two methods were used to determine the “best” tree. The first was to simply use 
the misclassification rate from the remaining cases applied to the tree. This is called the 
prediction misclassification rate (PMR). The PMR is attained by applying the cases held 
out from the building of the tree to the model. Each of these cases falls into a terminal 
node based on its own independent variables. The PMR counts the total number 
misclassified and divides it by the total number of cases applied. The tree with the smallest 


PMR was kept as a candidate for the best tree. 
The second method considered the misclassification rate of the tree itself. This
misclassification rate, called the tree misclassification rate (TMR), is the misclassification
rate of those cases used to build the tree. Unless a tree is allowed to grow until all the
terminal nodes are pure, the TMR will always be greater than zero. The second method
took the average of the TMR and the PMR. The tree with the smallest average of the two
misclassification rates was also saved as a candidate for the “best” tree.
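
The two selection criteria can be sketched in Python as follows. The function names and the rates for the three candidate trees are hypothetical; they are not results from this thesis, and the sketch does not reproduce the S-plus code in Appendix C.

```python
def misclassification_rate(actual, predicted):
    """Fraction of cases whose predicted class differs from the actual class."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

def select_best(trees):
    """Return the index of the best tree by PMR and by the mean of TMR and PMR."""
    by_pmr = min(range(len(trees)), key=lambda i: trees[i]["pmr"])
    by_avg = min(range(len(trees)),
                 key=lambda i: (trees[i]["tmr"] + trees[i]["pmr"]) / 2)
    return by_pmr, by_avg

# Hypothetical rates for three iterations; not results from the thesis.
trees = [
    {"tmr": 0.020, "pmr": 0.060},
    {"tmr": 0.050, "pmr": 0.055},
    {"tmr": 0.025, "pmr": 0.058},
]
print(select_best(trees))
```

In this made-up example the two criteria select different trees, which is why both candidates are retained at each configuration.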

In addition to the trees built using half of the data, trees were built using ninety 
percent of the data. The same procedures were followed as the trees built using fifty 
percent of the data. In model two, a random sample of 356 cases from the 396 “no fault” 
cases, and 219 cases from the 244 “fault” cases was drawn. The trees were grown and 
pruned back to the size suggested by a ten-fold cross-validation. In the case of model 
two, this target size is twelve terminal nodes. The remaining ten percent of the data were 
applied to the tree, and the misclassification rates were computed. The same criteria were 
used to determine the “best” tree from the models using the 90/10 split of the data as were 
used for the models using a 50/50 split of the data. 
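
The stratified 50/50 and 90/10 samples described above can be sketched in Python. The function name is hypothetical, and the sketch assumes that the cases are ordered by level and that fractional sample sizes are rounded down, which reproduces the 356 and 219 cases quoted in the text.

```python
import random

def stratified_split(level_sizes, percent, seed=0):
    """Sample `percent` of the case indices within each level; return (build, holdout)."""
    rng = random.Random(seed)
    build, holdout, start = [], [], 0
    for level, size in level_sizes.items():
        indices = list(range(start, start + size))
        chosen = set(rng.sample(indices, size * percent // 100))  # rounds down
        build += [i for i in indices if i in chosen]
        holdout += [i for i in indices if i not in chosen]
        start += size
    return build, holdout

# Model two level sizes from the text; cases are assumed to be ordered by level.
levels = {"no fault": 396, "fault": 244}
build, holdout = stratified_split(levels, percent=90)
print(len(build), len(holdout))
```

Calling the same function with `percent=50` gives the 320 building cases and 320 held-out cases of the 50/50 configuration.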

This method was applied 1000 times for each configuration of the dependent 
variable. For the tree grown during each iteration, two measures of goodness were 
considered. These measures of goodness are the misclassification rate from the predicted 
data, and the average of the misclassification rates from the tree and the predicted data. 
When the 1000 iterations were complete, there were four tree models from each of the 
two configurations of the dependent variable. In all, eight trees were kept in order to 
make an evaluation of the “best” tree for each configuration of the dependent variable. 


B. HEURISTIC MODEL SELECTION 


These eight trees are broken into sets of four for comparison. Each group 
represents the four best trees using a particular separation of the data used to build the 
model. They are further distinguished by the measure used to determine the “best” tree. 


Table 2 summarizes the four trees kept from the data in model one. 

MODEL 1
Dependent Variable: “Fault,” “EDM,” “EDMTHREE,” “Tooth”

Table 2. Summary of Best Trees from Model One Data

As is expected, the variability in the TMR from the trees grown from fifty percent 
of the data is greater than that of those grown from ninety percent of the data. Because 
ninety percent of the data is used for each tree, the best tree is determined using the 
averaging measure or the PMR method. Since only ten percent of the data is held back 
for use in prediction, the trees with the 90/10 split achieve a much smaller PMR. The 
trees found using the 90/10 split are, in fact, the same tree. The tree depicted in Figure 5 
is identical with regard to the variables used to build it. This is reassuring and suggests 
stability in the cross-validation procedure as outlined in Chapter III. 

In selecting the best tree for model one, consideration was given to the relative 
importance of the different types of errors seen by the different trees. If missed faults are 
considered the most undesirable error followed by false alarms and then misclassification 
of faults, then either of the trees grown from the 90/10 split appear to be the best tree for 
model one. The tree is depicted in Figure 7. Even though the tree grown from the 50/50
split using averaging only has 15 missed faults, the large number of false alarms rules it out
as the best tree.

Figure 7. Model One Tree Selected From Heuristic Method



The four trees from model two are the result of applying the 1000 iterations to the
data. They are summarized in Table 3.

MODEL 2
Dependent Variable: “Fault,” “No Fault”

Split of Data | Measure of Goodness | TMR | Missed Faults | False Alarms

Table 3. Summary of Best Trees from Model Two Data


Again, the variability in the TMR from the trees grown from fifty percent of the data is 
greater than that of those grown from ninety percent of the data. The same relationships 
between the split of the data and the values of TMR and PMR hold for model two. With 
only ten percent of the data held back, it was possible to find trees that perfectly predicted 
that small number of cases. Although the trees found by using the 90/10 split are different 
in this case, they are similar enough to suggest stability in the trees. 

In selecting the best tree for model two, consideration was also given to the 
relative importance of the different types of errors seen by the different trees. Similarly, 
missed faults are considered the most undesirable error followed by false alarms. The tree 
grown from the 90/10 split using averaging as the measure of goodness appears to be the 
best tree for Model Two. The tree is depicted in Figure 8. 

C. MODEL COMPARISON 


The trees grown for the model one data are nearly identical. Figure 5 depicts the
tree grown by the ten-fold cross-validation. Figure 7 depicts the tree determined “best” by
the heuristic method. Although the trees are slightly different, their interpretation gives
identical results. Each tree misclassifies 23 cases out of 640 while using
exactly the same independent variables as the splitting variables. The only difference
between the two trees is the value of the selected split in two instances.

Figure 8. Model Two Tree Selected From Heuristic Method

The two trees for the model two data are also very similar. Figure 6 depicts the 
tree grown by the ten-fold cross-validation. The total misclassification rate of this tree is 
20 cases out of 640. A slight improvement was realized by finding the tree by the 
heuristic method. Figure 8 depicts this tree. The independent variables used as the 
splitting variable are similar, but not in the same order. This similarity shows stability in 
the trees grown using both the cross-validation procedure and the heuristic method. Table 


4 summarizes the comparison of the trees for both data sets. 





MODEL 1: Dependent Variable: “Fault,” “EDM,” “EDMTHREE,” “Tooth” 
MODEL 2: Dependent Variable: “Fault,” “No Fault” 


Method | Model | Overall Misclassification Rate | Missed Faults | False Alarms | Misclassification of Faults

Table 4. Summary of Trees for Both Data Sets

D. MODEL APPLICATION 


Because these models were developed from data in a test cell, their applicability to
aircraft data is questionable. Twenty-six acquisitions from an SH-60B Seahawk helicopter
were available to assess the accuracy of the models built from HTTF data when applied to
actual aircraft data. The data from the helicopter is assumed to be all no-fault data. The
prediction tree shows the misclassification rate of the twenty-six cases as they are applied
to the models. Figure 9 is the prediction tree for the aircraft data applied to model one and
Figure 10 is the prediction tree for model two.

Model one does a mediocre job of predicting aircraft data. From the twenty-six
cases, twenty are classified correctly as “no fault.” Of the remaining six cases, two are 
misclassified as “edm” and four are classified as “edmthree.” This is interesting because in 
the test cell data, the “edmthree” faults were the most distinctive and never gave a false 
alarm or a false negative indication. 

Model two does a much better job of classifying the cases from the aircraft data. 
Only two of the twenty-six are misclassified as a fault. Although this is not an acceptable 
error rate for a HUMS system employed on an operational aircraft, it does demonstrate 


potential utility for tree-structured classification in determining thresholds for HUMS. 
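
As a quick check on the figures above, and under the stated assumption that all twenty-six aircraft acquisitions are fault-free, the false alarm rates of the two models can be computed directly. The variable names are introduced here for illustration.

```python
cases = 26
model_one_false_alarms = 2 + 4   # two classified "edm" plus four classified "edmthree"
model_two_false_alarms = 2

model_one_rate = f"{model_one_false_alarms / cases:.1%}"
model_two_rate = f"{model_two_false_alarms / cases:.1%}"
print(model_one_rate, model_two_rate)
```

Model one raises a false alarm on roughly one acquisition in four, and model two on roughly one in thirteen.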

Figure 9. Model One Prediction Tree From Aircraft Data

Figure 10. Model Two Prediction Tree From Aircraft Data

V. CONCLUSIONS AND RECOMMENDATIONS


The purpose of this thesis was to highlight the benefits and limitations of HUMS in 
its current state of development, and offer a methodology to begin exploring possible 
improvements. The limited scope of this thesis serves as an introduction to further study 
in the area of tree-structured classification applied to HUMS technology. 

This thesis used data from only one gear in the HTTF and developed models to 
characterize the structure of the data acquired from the two sensors near that gear. These 
models perform well within the confines of the data given. As was demonstrated with the 
data from the operational aircraft, the models do not necessarily provide reliable results 
when applied to data from a different test platform. This illustrates the need to acquire 
data that accurately characterizes that of an operational aircraft. 

Currently, the NAWC Trenton HTTF is the best source of data for applying this 
methodology and developing models to predict failure in aircraft components. Their 
ability to insert faulted components into an operational transmission enables them to 
develop and maintain a rich data set for tree-structured classification. A better source of 
data would obviously be data from the aircraft itself. Although data is available from the 
aircraft, it is of little value in characterizing the structure of faulted components, granted 
the aircraft has only good components. For obvious reasons, little data exists in which a 
faulted component is flown in an operational aircraft. Without this data, models that 
accurately predict the structure of aircraft data may be difficult to develop. 

A recommendation to move toward achieving this goal is installing HUMS on 
more operational aircraft. An accurately maintained base of aircraft data would enhance 
the capabilities of this research. Even though the majority of the data would be “no fault,” 
eventually a library of data would develop in which faults were believed to have been 
present in some of the acquired data. Assumptions would have to be made about when a 
certain fault occurred, and which acquisitions are believed to contain that fault. These
assumptions would be determined in conjunction with the maintenance action that
discovered the fault. As this database of faults develops, HUMS may become more
accurate and begin predicting these faults.

Further research is necessary to fully investigate the usefulness of tree-structured 
classification in HUMS. Analysis similar to the type done in this thesis should be done on 
numerous other gears, bearings and shafts in the HTTF. The models developed through 
this research will help determine the usefulness of this type of analysis to HUMS. 

In addition to further model development, there exists a need to continue to 
acquire data from the HTTF. To the greatest extent possible, the faulted components 
installed in the HTTF should be those that were discovered in post-flight maintenance 
inspections or at depot level maintenance. These faults that occur in the aircraft will help 
the tree-structured classification algorithm to more accurately characterize the structure of 
the operational aircraft. 

In this thesis, all the components were intentionally faulted rather than being 
components with fleet-rejected faults. This may have been one of the reasons that the
models did poorly when predicting the aircraft data. For instance, the edm fault is a small
machined slit in the gear made to seed a crack propagation. This type of fault may never 
be found on an operational aircraft. If a pit in a gear exists in an operational aircraft, it is 
conceivable that it would remain in the aircraft, and be classified as “no fault” data. 
Installing faults discovered during maintenance would ensure that the fault would normally 
be found, and should adequately be characterized by tree-structured classification. 

This thesis demonstrated the usefulness of tree-structured classification in HUMS 
research. Still much needs to be done to prove its ability to accurately predict faults in 
operational aircraft. Since HUMS is in its infancy, it is reasonable to believe that
methodology similar to that contained in this thesis will assist in its development and
implementation.

APPENDIX A. SAMPLE OF DATA SET


The entire data set is too large to reproduce in an appendix. The data set depicted is a 
sample of the entire data set showing six of the independent variables used in the final trees and 


examples of all levels of the dependent variable. 

APPENDIX B. S-PLUS TREE SUMMARIES


This appendix contains the S-Plus output for each tree model constructed. It contains 
the details of the tree. Each line of the tree has the node, the split that separated the cases, the 
deviance at that node, the classification of the node, and a vector with the probabilities of each 
case inthe node. An asterisk denotes a terminal node. Each tree corresponds to a figure in the 


text of the thesis. 
POS) ERZAecsoe LS 2 UeooodOertault ( 0.863600 0.13640 ) * 
HOG) PRZA. 2s. 115° 0 Uevvgoosrault ( 1.000000 0.00000 ) * 
27) RAWRMS.1>24.75 1 a zouu0 no ( 0.117600 0.88240 ) * 
(ees esere 045955 9 1lys0000 no ( 0.147100 0.85290 ) 
ieirestasO0le ss 18-75000 no ( 0.043960 0.95600 ) * 
oy tn al 01 0 DecucoOsrault ( 1.000000 0.00000 ) * 
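
As an illustration of the deviance and probability columns described at the start of this 
appendix, the following Python sketch computes both from the class counts at a single node. It 
is not part of the original S-Plus analysis. It assumes the usual multinomial deviance for 
classification trees, D = -2 * sum(n_k * log(n_k / n)), where n_k is the number of cases of 
class k in the node and n is the node total. The function names and the example counts are 
hypothetical. 

```python
import math


def class_probabilities(class_counts):
    """Proportion of each class among the cases in one node."""
    n = sum(class_counts)
    return [k / n for k in class_counts]


def node_deviance(class_counts):
    """Multinomial deviance of one node: -2 * sum(n_k * log(n_k / n)).

    Classes with zero cases contribute nothing to the sum, so a node
    containing a single class has deviance zero.
    """
    n = sum(class_counts)
    return -2.0 * sum(k * math.log(k / n) for k in class_counts if k > 0)


# Hypothetical node holding 19 cases of one class and 3 of another.
counts = [19, 3]
print(class_probabilities(counts))
print(node_deviance(counts))
```

A node whose probability vector is ( 1.000000 0.00000 ) is pure, so under this definition its 
deviance is zero; the more evenly mixed the classes, the larger the deviance. 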
APPENDIX C. S-PLUS CODE FOR HEURISTIC 


The following code produces ‘iter’ trees from data set ‘df’ using a stratified random 
sample of fifty percent of the data. Note that this code is not generic, in that the levels of the 
dependent variable must be written into the code as row ranges, in the order in which they 
appear in the S-plus data frame. The fifty percent sample is coded using the size parameter of 
the sample function in S-plus. To modify this function to use a different data set or sample a 
different proportion of the data, the appropriate lines must be recoded. Explanation of code is 
preceded by # and follows the code it explains. 


function(df = model1.dat, iter = 2) 
{ 
    tree.misclass.vector <- vector(length = iter) 
    predict.misclass.vector <- vector(length = iter) 
    split.variable.vector <- vector(length = iter) 
    # creates vectors to hold the TMR, PMR and first splitting variable for each 
    # tree 
    smallest.predict.error <- -1 
    smallest.fif.error <- -1 

    for(count in 1:iter) { 
        nofault.sample <- sample(1:396, size = 198) 
        edm.sample <- sample(397:582, size = 93) 
        edmthree.sample <- sample(583:618, size = 18) 
        tooth.sample <- sample(619:640, size = 11) 
        # randomly samples half the data for each level of the dependent 
        # variable 
        tree.sim.full <- tree(df[c(nofault.sample, edm.sample, 
            edmthree.sample, tooth.sample), ]) 
        tree.sim <- prune.tree(tree.sim.full, best = 10) 


        # grows and prunes tree from the randomly sampled data 
        sts <- summary(tree.sim) 
        tree.misclass.vector[count] <- sts$misclass[1]/sts$misclass[2] 
        split.variable.vector[count] <- sts$used[1] 
        # saves the TMR and first splitting variable into their respective vectors 
        pt <- predict.tree(tree.sim, newdata = df[ - c(nofault.sample, 
            edm.sample, edmthree.sample, tooth.sample), ], type = 
            "tree") 
        # applies remaining half of the data to the tree for prediction 
        spt <- summary(pt) 
        predict.misclass.vector[count] <- spt$misclass[1]/spt$misclass[2] 
        # saves the PMR into its vector 
        tree.predict.error <- predict.misclass.vector[count] 
        tree.fif.error <- 0.5 * tree.misclass.vector[count] + 0.5 * 
            predict.misclass.vector[count] 


        # computes the two ‘measures of goodness’ 
        if(smallest.predict.error < 0 || 
            tree.predict.error < smallest.predict.error) { 
            best.predict.tree <- tree.sim 
            smallest.predict.error <- tree.predict.error 
            best.predict.tmr <- tree.misclass.vector[count] 
            best.predict.pmr <- predict.misclass.vector[count] 
            best.predict.error <- tree.predict.error 
        } 
        # compares first ‘measure of goodness’ of current tree to ‘best’ and 
        # saves current tree as best if applicable 
        if(smallest.fif.error < 0 || tree.fif.error < smallest.fif.error) { 
            best.fif.tree <- tree.sim 
            smallest.fif.error <- tree.fif.error 
            best.fif.tmr <- tree.misclass.vector[count] 
            best.fif.pmr <- predict.misclass.vector[count] 
            best.fif.error <- tree.fif.error 
        } 
        # compares second ‘measure of goodness’ of current tree to ‘best’ and 
        # saves current tree as best if applicable 
    } 


    list(tmr = tree.misclass.vector, pmr = predict.misclass.vector, first = 
        split.variable.vector, tree.fif = best.fif.tree, tree.predict = 
        best.predict.tree, tree.fif.tmr = best.fif.tmr, tree.fif.pmr = 
        best.fif.pmr, error.fif = best.fif.error, tree.predict.tmr = 
        best.predict.tmr, tree.predict.pmr = best.predict.pmr, 
        error.predict = best.predict.error) 
} 
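
For readers without access to S-Plus, the steps of the heuristic can be sketched in Python. 
This is an illustrative sketch under stated assumptions, not the thesis code: a single-threshold 
"stump" on one variable stands in for the pruned tree, the data are synthetic, and every 
function and variable name is hypothetical. It follows the same sequence as the listing: a 
stratified fifty percent sample of each class, the tree misclassification rate (TMR) on the 
sampled half, the prediction misclassification rate (PMR) on the remaining half, and the two 
measures of goodness (PMR alone, and the equal-weight average of TMR and PMR). 

```python
import random


def stratified_half(labels, rng):
    """Sample half of each class without replacement; return (train, test) indices."""
    by_class = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    train = []
    for members in by_class.values():
        train.extend(rng.sample(members, len(members) // 2))
    chosen = set(train)
    test = [i for i in range(len(labels)) if i not in chosen]
    return sorted(train), test


def majority(labels):
    """Most frequent label; ties are broken alphabetically."""
    return max(sorted(set(labels)), key=labels.count)


def predict(model, value):
    cut, left_label, right_label = model
    return left_label if value < cut else right_label


def misclass_rate(model, x, y):
    wrong = sum(1 for v, lab in zip(x, y) if predict(model, v) != lab)
    return wrong / len(y)


def fit_stump(x, y):
    """Stand-in for growing and pruning a tree: the best single threshold on x."""
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2.0
        left = [lab for v, lab in zip(x, y) if v < cut]
        right = [lab for v, lab in zip(x, y) if v >= cut]
        model = (cut, majority(left), majority(right))
        err = misclass_rate(model, x, y)
        if best is None or err < best[0]:
            best = (err, model)
    if best is None:  # every x value identical: predict the majority class
        lab = majority(y)
        return (values[0], lab, lab)
    return best[1]


def best_of_random_halves(x, y, iters=20, seed=0):
    """Repeat the split/fit/predict cycle and keep the best model by each measure."""
    rng = random.Random(seed)
    tmrs, pmrs = [], []
    best_predict = None  # (PMR, model)
    best_fif = None      # (0.5 * TMR + 0.5 * PMR, model)
    for _ in range(iters):
        train, test = stratified_half(y, rng)
        model = fit_stump([x[i] for i in train], [y[i] for i in train])
        tmr = misclass_rate(model, [x[i] for i in train], [y[i] for i in train])
        pmr = misclass_rate(model, [x[i] for i in test], [y[i] for i in test])
        fif = 0.5 * tmr + 0.5 * pmr
        tmrs.append(tmr)
        pmrs.append(pmr)
        if best_predict is None or pmr < best_predict[0]:
            best_predict = (pmr, model)
        if best_fif is None or fif < best_fif[0]:
            best_fif = (fif, model)
    return {"tmr": tmrs, "pmr": pmrs,
            "error_predict": best_predict[0], "tree_predict": best_predict[1],
            "error_fif": best_fif[0], "tree_fif": best_fif[1]}


# Synthetic one-variable data: 40 "no" cases and 20 "fault" cases.
demo_rng = random.Random(1)
y = ["no"] * 40 + ["fault"] * 20
x = ([demo_rng.gauss(20, 3) for _ in range(40)]
     + [demo_rng.gauss(28, 3) for _ in range(20)])
result = best_of_random_halves(x, y, iters=25, seed=2)
print(result["error_predict"], result["error_fif"])
```

Unlike the S-Plus listing, this sketch finds the class strata from the labels rather than from 
hard-coded row ranges, so it does not need to be recoded for a different data set. 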